Refactor(FindSimilar): MilvusCache to use Milvus Search API #352

srini-abhiram · 2025-10-06T04:57:14Z

Replaces manual similarity calculation and query-based retrieval in FindSimilar with Milvus's Search API for more efficient and accurate similarity search. Updates index creation to use the new HNSW index API. Improves cache hit/miss logic and error handling.

What type of PR is this?
refactor(FindSimilar): Migrate to Milvus for similarity search

What this PR does / why we need it:
This PR refactors the FindSimilar functionality to use the Milvus vector database for similarity search, replacing the previous manual calculation and query-based retrieval logic.

Key changes include:
Adopting Milvus Search API: All similarity search operations now leverage Milvus's native Search API, which is highly optimized for performance and accuracy.

HNSW Indexing: The index creation process has been updated to use the new HNSW (Hierarchical Navigable Small World) index API, which provides faster and more accurate search results for large-scale vector data.

Code Improvements: The caching logic has been streamlined, and error handling for interactions with the Milvus service has been made more robust.

This migration improves the efficiency, scalability, and accuracy of our similarity search feature, and reduces the maintenance overhead of the custom-built Go solution.
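For readers unfamiliar with the approach being replaced, the following is a minimal sketch of what a manual similarity calculation with a cache hit/miss threshold can look like. It is not the code from `milvus_cache.go`: the `entry` type, the `cosine` and `findSimilarBruteForce` functions, and the `0.8` threshold are all hypothetical names and values chosen for illustration. The point is that every lookup scans all cached embeddings, which is the work a vector database's search API and index take over.

```go
package main

import (
	"fmt"
	"math"
)

// entry is a hypothetical cached item: an embedding plus the cached response.
type entry struct {
	embedding []float32
	response  string
}

// cosine returns the cosine similarity of a and b.
// It returns 0 if the lengths differ or either vector has zero norm.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// findSimilarBruteForce scans every entry (O(n) per lookup) and reports a
// cache hit only when the best score meets the threshold (an assumed policy).
func findSimilarBruteForce(query []float32, entries []entry, threshold float64) (string, float64, bool) {
	best, bestScore := -1, math.Inf(-1)
	for i, e := range entries {
		if s := cosine(query, e.embedding); s > bestScore {
			best, bestScore = i, s
		}
	}
	if best < 0 || bestScore < threshold {
		return "", bestScore, false // cache miss
	}
	return entries[best].response, bestScore, true // cache hit
}

func main() {
	entries := []entry{
		{[]float32{1, 0}, "answer A"},
		{[]float32{0, 1}, "answer B"},
	}
	resp, score, hit := findSimilarBruteForce([]float32{0.9, 0.1}, entries, 0.8)
	fmt.Println(resp, score, hit)
}
```

With a search API, the scan and scoring happen on the server against an index, and the client only compares the top result's score against the threshold to decide between a hit and a miss.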

Which issue(s) this PR fixes:
Fixes #150

Release Notes: No

netlify · 2025-10-06T04:58:02Z

✅ Deploy Preview for vllm-semantic-router ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | `2ea24ed` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/68e4bac6568b47000856bb0a |
| 😎 Deploy Preview | https://deploy-preview-352--vllm-semantic-router.netlify.app |

To edit notification comments on pull requests, go to your Netlify project configuration.

srini-abhiram · 2025-10-06T05:00:31Z

If the code changes are fine, I can add an integration test for the Milvus cache. Please advise if my code is incorrect; I'm open to criticism.

rootfs · 2025-10-06T14:30:28Z

@srini-abhiram this is cool! can you sign the DCO

In your local branch, run: `git rebase HEAD~1 --signoff`
Force push your changes to overwrite the branch: `git push --force-with-lease origin issue-150`

Replaces manual similarity calculation and query-based retrieval in FindSimilar with Milvus's Search API for more efficient and accurate similarity search. Updates index creation to use the new HNSW index API. Improves cache hit/miss logic and error handling.

Signed-off-by: Srinivas A <[email protected]>

srini-abhiram · 2025-10-07T03:21:03Z

@rootfs I have followed your instructions and signed the commit.

Xunzhuo

looks good, thanks!

github-actions · 2025-10-07T07:01:34Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `src`

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

src/semantic-router/pkg/cache/milvus_cache.go

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

srini-abhiram · 2025-10-07T10:52:40Z

@rootfs I haven't added the integration test case for Milvus Search yet; I am working on it. Should I create a separate PR when I'm done?

Xunzhuo · 2025-10-07T11:47:04Z

sure, plz go ahead in a separate PR, thanks! @srini-abhiram

srini-abhiram requested review from rootfs, Xunzhuo and wangchen615 as code owners October 6, 2025 04:57

rootfs approved these changes Oct 6, 2025

srini-abhiram force-pushed the issue-150 branch from dbe4332 to 38ffba0 Compare October 7, 2025 03:16

Xunzhuo approved these changes Oct 7, 2025

Merge branch 'main' into issue-150

2ea24ed

github-actions bot assigned rootfs, wangchen615 and Xunzhuo Oct 7, 2025

Xunzhuo merged commit 256c305 into vllm-project:main Oct 7, 2025
8 checks passed
